Abstract

Steepest  feasible  descent  methods  for  inequality  constrained  optimization  problems 
have  commonly  been  plagued  by  short  steps.  The  consequence  of  taking  short  steps 
is slow convergence or even convergence to nonstationary points (zigzagging). In
linear programming, both the projective algorithm of Karmarkar (1984) and its affine-variant,
originally proposed by Dikin (1967), can be viewed as steepest feasible descent
methods.  However,  both  of  these  algorithms  have  been  demonstrated  to  be  effective 
and  seem  to  have  overcome  the  problem  of  short  steps.  These  algorithms  share  a 
common  norm.  It  is  this  choice  of  norm,  in  the  context  of  steepest  feasible  descent, 
that  we  refer  to  as  the  Dikin-Karmarkar  Principle. 

This  research  develops  mathematical  theory  to  quantify  the  short  step  behavior  of 
Euclidean  norm  steepest  feasible  descent  methods  and  the  avoidance  of  short  steps  for 
steepest  feasible  descent  with  respect  to  the  Dikin-Karmarkar  norm.  While  the  theory 
is  developed  for  linear  programming  problems  with  only  nonnegativity  constraints  on 
the  variables,  our  numerical  experimentation  demonstrates  that  this  behavior  occurs 
for  the  more  general  linear  program  with  equality  constraints  added.  Our  numerical 
results  also  suggest  that  taking  longer  steps  is  not  sufficient  to  ensure  the  efficiency  of 
a  steepest  feasible  descent  algorithm.  The  uniform  way  in  which  the  Dikin-Karmarkar 
norm  treats  every  boundary  is  important  in  obtaining  satisfactory  convergence. 
Chapter 1

INTRODUCTION 


The  announcement  of  a  practical,  highly-efficient  polynomial-time  algorithm  for  linear 
programming  by  Karmarkar  [23]  in  1984  created  much  excitement  in  the  mathematical 
community.  This  single  algorithm  sparked  a  huge  amount  of  related  research — to 
verify  numerical  claims,  to  modify  and  extend  the  algorithm,  to  develop  new  interior 
point  methods  (both  for  linear  and  nonlinear  programming).  Soon  after  Karmarkar’s 
projective  algorithm  was  published,  its  affine-scaling  variant  was  proposed  by  several 
researchers (for example, Barnes [5] and Vanderbei, Meketon, and Freedman [40]). It
was  later  learned  that  the  affine-scaling  variant  had  originally  been  introduced  by 
Dikin  [13]  in  1967. 

The algorithms proposed by Dikin and Karmarkar both solve subproblems
at each iteration which produce steepest feasible descent directions with respect to
a common, well-chosen norm. However, steepest feasible descent methods have been
known to produce short steps which may result in very slow convergence or
convergence to nonstationary points. The norm used by Dikin and Karmarkar could
be considered an optimal choice for steepest feasible descent applied to linear
programming since global convergence properties have been proven and good practical
results have been demonstrated. It is this choice of norm, in the context of solving
problems with nonnegativity constraints, that we refer to as the Dikin-Karmarkar
Principle. Steepest feasible descent with respect to this norm has the surprising
property that the steps are bounded away from short steps as the boundary is neared,
unlike Euclidean norm steepest feasible descent, which virtually assures that short
steps will be taken as the solution is approached.

This thesis is organized in the following manner. Chapter 2 contains a description
of steepest descent methods for unconstrained and constrained optimization and
a historical perspective. Chapter 3 points out the difficulty that steepest feasible
descent methods encounter because of short steps, the so-called “zigzagging”
phenomenon. Chapter 4 presents Karmarkar’s projective algorithm and its affine-scaling
variant. Dikin’s algorithm is also presented. The relation of both the projective
algorithm and the affine-scaling algorithm to steepest descent is discussed. Chapter 5
discusses the Dikin-Karmarkar norm along with a geometric interpretation, together
with the observations that motivated this research: the behavior of steepest feasible
descent with respect to the Euclidean norm versus its behavior with respect to the
Dikin-Karmarkar norm. Chapter 6 contains the theoretical results of this research.
We concern ourselves with the special linear program in which the only constraints
are nonnegativity constraints on the variables. We show that as the boundary is
approached, the steepest feasible descent step with respect to the Euclidean norm
must become progressively shorter and, in fact, asymptotically approaches the
shortest possible step. For steepest feasible descent with respect to the
Dikin-Karmarkar norm, by contrast, the step is asymptotically bounded away from that
shortest step as the boundary is approached. This behavior is demonstrated
numerically, for the more general linear programming problem with linear equality
constraints, in Chapter 7. Chapter 8 gives some final remarks and observations.
Chapter 2

STEEPEST  DESCENT 

2.1  Unconstrained  Minimization 

We  begin  by  considering  the  unconstrained  minimization  problem 

$$\text{minimize} \quad f(x), \tag{2.1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable.

A natural requirement for an iterative method to solve problem (2.1) is that the
objective function value decrease at each iteration, i.e. $f(x + \alpha d) < f(x)$ for some
$\alpha > 0$. To obtain decrease, an obvious choice for $d$ is a vector that gives the greatest
local decrease in the objective function $f$. In other words, we ask for a vector that
minimizes $\nabla f(x)^T d$ with respect to $d$. Clearly, to make this minimization
well-defined, we must impose some type of normalization on the direction vector $d$. This
notion is formalized in the following definition.

Definition 2.1 (Steepest Descent Direction) By a steepest descent direction
for $f$ at $x$, with respect to a given norm $\|\cdot\|$, we mean any $d$
that solves

$$\begin{array}{ll} \text{minimize} & \nabla f(x)^T d \\ \text{subject to} & \|d\| \le \delta, \end{array} \tag{2.2}$$

for some $\delta > 0$.

Since $\{ d : \|d\| \le \delta \}$ is a compact set, a solution to Problem (2.2) exists, though it
may not be unique.

Clearly, the solutions to (2.2) depend on the choice of norm. When the norm is a
weighted Euclidean norm, i.e.

$$\|\cdot\|_W = \|W^{-1}\,\cdot\|_2, \tag{2.3}$$

where $W \in \mathbb{R}^{n \times n}$ is a symmetric, positive definite matrix, then the steepest descent
direction with respect to the $W$-norm (2.3) is unique (for a given $\delta > 0$) and it is a
positive scalar multiple of

$$-W^2 \nabla f(x). \tag{2.4}$$

For $W = I$, the norm is the Euclidean norm, and the negative gradient is a direction
of steepest descent.
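
As a minimal sketch of (2.4), assuming NumPy is available, the following computes the solution of subproblem (2.2) in the $W$-norm (2.3). The helper name `steepest_descent_direction` and the sample data `W` and `grad` are illustrative assumptions, not taken from the text.

```python
import numpy as np

def steepest_descent_direction(grad, W, delta=1.0):
    """Solve: minimize grad^T d  subject to  ||W^{-1} d||_2 <= delta.

    W is assumed symmetric positive definite. The solution is the positive
    multiple of -W^2 grad whose W-norm equals delta.
    """
    Wg = W @ grad
    norm_Wg = np.linalg.norm(Wg)
    if norm_Wg == 0.0:
        return np.zeros_like(grad)  # stationary point: no descent direction
    return -(delta / norm_Wg) * (W @ Wg)

# Illustrative data (assumed for this sketch).
W = np.array([[2.0, 0.5], [0.5, 1.0]])
grad = np.array([1.0, -3.0])

d = steepest_descent_direction(grad, W)
w_norm_d = np.linalg.norm(np.linalg.solve(W, d))  # ||W^{-1} d||_2
```

With $W = I$ the same routine returns the normalized negative gradient, which matches the Euclidean case described above.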

We  formally  define  a  method  of  steepest  descent  as  follows: 

Definition 2.2 (Method of Steepest Descent) By a method of steepest
descent for problem (2.1), we mean any iterative method of the form

$$x^{k+1} = x^k + \alpha_k d^k, \qquad \alpha_k > 0,$$

in which $d^k$ is a steepest descent direction for $f$ at $x^k$ as described in
Definition 2.1.

We refer to a problem of the form (2.2) as a steepest descent subproblem for
problem (2.1).

2.2  Linearly  Constrained  Minimization 

Consider  the  optimization  problem  with  linear  equality  constraints  and  nonnegativity 
constraints  on  the  variables: 

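$$\begin{array}{ll} \text{minimize} & f(x) \\ \text{subject to} & Ax = b \\ & x \ge 0, \end{array} \tag{2.5}$$

where $f : \mathbb{R}^n \to \mathbb{R}$, $x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, and $A \in \mathbb{R}^{m \times n}$. We say that $x$ is strictly
feasible for problem (2.5) if $Ax = b$ and $x > 0$.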

The concept of steepest descent is generalized to the linearly constrained
problem (2.5) as follows:

Definition 2.3 (Steepest Feasible Descent Direction) By a steepest feasible
descent direction for $f$ at $x$, with respect to a given norm $\|\cdot\|$, we
mean any $d$ that solves

$$\begin{array}{ll} \text{minimize} & \nabla f(x)^T d \\ \text{subject to} & Ad = 0 \\ & \|d\| \le \delta, \end{array} \tag{2.6}$$

for $x$ a strictly feasible point for problem (2.5) and some $\delta > 0$.
REMARK: Any direction that satisfies Definition 2.3 is a feasible direction as
described by Zoutendijk [42]. He described a class of solution methods for constrained
minimization, the so-called methods of feasible directions. In this class of iterative
methods, the starting point is feasible and all the iterates remain feasible.

Again, the solutions to problem (2.6) depend on the norm. The steepest feasible
descent direction with respect to the $W$-norm (2.3) is simply the projection onto
the null space of $A$, in the $W$-norm, of the steepest descent direction for the
unconstrained problem. In particular, the steepest feasible descent direction is given
by

$$-W^2 \left[ I - A^T (A W^2 A^T)^{-1} A W^2 \right] \nabla f(x). \tag{2.7}$$

Note that in the case where we are minimizing $f$ with only nonnegativity constraints
on the variables, the steepest feasible descent direction with respect to the $W$-norm
reduces to

$$-W^2 \nabla f(x),$$

for $x > 0$.
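
Formula (2.7) can be evaluated without forming the bracketed matrix explicitly. The sketch below assumes NumPy, a matrix $A$ of full row rank, and a symmetric positive definite $W$; the function name and the small test problem are hypothetical choices made for illustration.

```python
import numpy as np

def steepest_feasible_descent_direction(grad, A, W):
    """Evaluate (2.7): -W^2 [I - A^T (A W^2 A^T)^{-1} A W^2] grad.

    A is assumed to have full row rank and W to be symmetric positive
    definite, so that A W^2 A^T is invertible.
    """
    W2 = W @ W
    W2g = W2 @ grad
    # Multipliers y solve (A W^2 A^T) y = A W^2 grad.
    y = np.linalg.solve(A @ W2 @ A.T, A @ W2g)
    return -(W2g - W2 @ (A.T @ y))

# Illustrative data (assumed): one equality constraint in R^3.
A = np.array([[1.0, 1.0, 1.0]])
grad = np.array([3.0, 1.0, 2.0])
x = np.array([0.5, 0.3, 0.2])          # a strictly feasible point for A x = 1

d_euclid = steepest_feasible_descent_direction(grad, A, np.eye(3))
d_scaled = steepest_feasible_descent_direction(grad, A, np.diag(x))
```

Both directions lie in the null space of $A$ and are descent directions; only the weighting differs. The choice `W = np.diag(x)` is one example of a weighted norm that depends on the current point.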

Analogous to Definition 2.2, we give the following definition.

Definition 2.4 (Method of Steepest Feasible Descent) By a method of
steepest feasible descent for problem (2.5), we mean any iterative method
of the form

$$x^{k+1} = x^k + \alpha_k d^k, \qquad \alpha_k > 0,$$

in which $d^k$ is a steepest feasible descent direction for $f$ at $x^k$ as described
in Definition 2.3.

This definition ensures that the iterates remain strictly feasible.* We refer to a
problem of the form (2.6) as a steepest feasible descent subproblem for problem (2.5).

*In accordance with contemporary terminology, such a method could be called an interior point
method.

2.3 Historical Perspective

2.3.1 Cauchy

The gradient method was originally proposed by Cauchy [8] in 1847 and is a method
of steepest descent with respect to the Euclidean norm. Cauchy considered the
problem of minimizing a function of several variables. Using a first-order Taylor series
approximation, he noted that taking a sufficiently small step in the direction of the
negative gradient would guarantee decrease in the value of the objective function.
Cauchy chose the steplength to give the global minimizer in the negative gradient
direction, i.e. $\alpha_k$ solved

$$\underset{\alpha > 0}{\text{minimize}} \quad f(x^k - \alpha \nabla f(x^k)). \tag{2.8}$$

No convergence analysis was given in this classical paper. Cauchy simply suggested
that since the function value would decrease at each step, eventually the minimum
would be achieved. (Cauchy made the remark that in order to obtain the new iterates
quickly, one could use Newton’s method or the secant method on the one dimensional
minimization subproblem to obtain a steplength.)
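
For a strictly convex quadratic objective, the steplength problem (2.8) has a closed-form solution, which gives a convenient way to experiment with the gradient method. The sketch below assumes NumPy and the objective $f(x) = \tfrac{1}{2} x^T Q x - b^T x$ with $Q$ symmetric positive definite; this objective, the function name, and the data are assumptions made for illustration and do not come from Cauchy's paper.

```python
import numpy as np

def cauchy_gradient_method(Q, b, x0, tol=1e-10, max_iter=10_000):
    """Gradient method with exact line search for f(x) = 0.5 x^T Q x - b^T x.

    Q is assumed symmetric positive definite, so the minimizer of
    f(x - alpha * g) over alpha > 0 is alpha = g^T g / g^T Q g.
    """
    x = np.asarray(x0, dtype=float)
    values = [0.5 * x @ Q @ x - b @ x]
    for _ in range(max_iter):
        g = Q @ x - b                      # gradient of f at x
        if np.linalg.norm(g) <= tol:
            break
        alpha = (g @ g) / (g @ Q @ g)      # solves (2.8) exactly for this f
        x = x - alpha * g
        values.append(0.5 * x @ Q @ x - b @ x)
    return x, values

# Illustrative data (assumed).
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star, values = cauchy_gradient_method(Q, b, x0=[2.0, 1.0])
```

The list `values` records the objective at each iterate, so the monotone decrease that Cauchy relied on can be inspected directly.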

2.3.2  Curry 

In 1944, Curry [11] published perhaps the first convergence result for the gradient
method for unconstrained optimization. For continuously differentiable functions
on $\mathbb{R}^n$, he proved that with the proper choice of steplength, every limit point of
the sequence generated by the gradient method is a stationary point of $f$, i.e. the
gradient method cannot converge to a point that is not a stationary point. Curry’s
choice of steplength $\alpha_k$ was the first stationary point of problem (2.8). Curry’s result
also holds where the steplength $\alpha_k$ is the first local minimizer in the negative gradient
direction. Byrd and Tapia [7] extended Curry’s theorem to arbitrary choices of norm
and to spaces of arbitrary dimension.

2.3.3  Rosen 

In 1957, Rosen extended the gradient method to constrained optimization. His
gradient projection method was proposed first for linearly constrained problems [34, 35]
and then extended to nonlinearly constrained problems [36] in 1961. Rosen’s gradient
projection method is based on a projection of the gradient of the objective function
onto a subspace of the domain, where the subspace is defined by the intersection of
hyperplanes that are determined by the active constraints.

Given a feasible point $x^0$, Rosen’s method generates a sequence of the form

$$x^{k+1} := x^k + \alpha_k P_k(-\nabla f(x^k)), \tag{2.9}$$

where $P_k$ is a linear Euclidean norm projection operator. The steplength is taken
to be the smaller of the value for which a new inequality constraint becomes
active and the value which minimizes the objective function in the current direction.
For the linearly constrained problem (2.5), when the constraint matrix $A$ has full
rank, the steepest feasible descent direction with respect to the Euclidean norm is
given by the Euclidean norm projection of $-\nabla f(x)$ onto the null space of the
constraint matrix $A$. This follows from a straightforward application of the second
order necessary conditions to problem (2.6).

2.3.4 Goldstein and Levitin & Poljak

A  gradient  projection  method  for  convex  programming  in  a  Hilbert  space  setting  was 
proposed  by  Goldstein  [19,  20]  in  1964  and  independently  by  Levitin  and  Poljak  [27] 
in  1965.  The  method  computes  the  iterative  sequence  as  follows: 

$$x^{k+1} := P_S(x^k - \alpha_k \nabla f(x^k)), \tag{2.10}$$

where $P_S$ is the closest point projection operator onto $S$ in the Hilbert space and
$S$ is the convex feasible region.
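
When the feasible region is the nonnegative orthant, the closest point projection has a simple componentwise form. The following sketch of one iteration of (2.10) assumes $S = \{x : x \ge 0\}$ in $\mathbb{R}^n$ with the Euclidean inner product; the function names, the linear objective, and the steplength are illustrative assumptions.

```python
def project_onto_orthant(x):
    """Closest point to x in S = {x : x >= 0} in the Euclidean norm."""
    return [max(0.0, xi) for xi in x]

def gradient_projection_step(x, grad, alpha):
    """One iteration of (2.10) with S taken to be the nonnegative orthant."""
    trial = [xi - alpha * gi for xi, gi in zip(x, grad)]
    return project_onto_orthant(trial)

# Illustrative linear objective c^T x, whose gradient is c everywhere.
c = [1.0, 2.0]
x = [0.5, 0.4]
x_next = gradient_projection_step(x, c, alpha=0.3)
```

In this example the trial point leaves the orthant in its second component, and the projection returns that component to the boundary while the first component keeps its full step.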

Goldstein proved that Curry’s theorem holds under the assumptions: (1) the
objective function, $f$, is twice continuously differentiable, (2) $f$ is bounded below
on the convex feasible set $S$, and (3) the Hessian of $f$ is uniformly bounded on
$S$. Levitin and Poljak proved Curry’s theorem holds under the assumptions: (1) the
Jacobian of $f$ is uniformly Lipschitz continuous on the feasible region $S$, and (2)
the convex feasible region is bounded.

2.3.5  McCormick  and  Tapia 

In 1972, McCormick and Tapia [29] studied Goldstein’s gradient projection method
for a general objective function. They proved Curry’s theorem under less stringent
assumptions than needed by Goldstein and Levitin and Poljak. They assumed that
(1) the objective function $f$ is continuously Fréchet differentiable on the feasible
region $S$ and (2) the feasible region $S$ is closed and convex.

Chapter 3

THE  CURSE  OF  SHORT  STEPS 


3.1  The  Phenomenon  of  Zigzagging 

It  is  natural  to  ask  whether  Curry’s  theorem  holds  for  Rosen’s  projected  gradient 
method  applied  to  as  simple  a  problem  as  (2.5),  i.e.  for  steepest  feasible  descent  with 
respect  to  the  Euclidean  norm. 


3.1.1  Zoutendijk 

Zoutendijk [42] recognized that most feasible direction methods, without careful
steplength control, may converge to a point that is not a stationary point. He pointed
out that these methods have the potential to generate iterates that bounce between
constraints without making adequate progress on the minimization problem. In
requiring the iterates to be feasible, the steplength choice often emphasizes feasibility
at the expense of function decrease. He coined the term zigzagging to describe this
phenomenon. Zigzagging occurs when the steplength is determined by the constraints
rather than the minimization of the objective function. As a result, the iterates
can converge to a point which is not a solution to the minimization problem.


3.1.2  Wolfe’s  Example 


Wolfe studied the behavior of Rosen’s gradient projection method for a special case
of problem (2.5) where the only constraints were nonnegativity constraints on the
variables:

$$\begin{array}{ll} \text{minimize} & f(x) \\ \text{subject to} & x \ge 0. \end{array} \tag{3.1}$$

He set out to prove that, under mild conditions on the objective function, the gradient
projection method would converge. In fact, he was able to construct an example for
which Rosen’s method produced a sequence of points that converged to a point, $x$,
that was not a stationary point. (The results are seen graphically in Figure 3.1.*)
Hence, Rosen’s gradient projection method does not satisfy Curry’s theorem.

Figure 3.1 Wolfe’s Zigzagging Example


3.2  McCormick’s  Anti-zigzagging  Strategy 

McCormick recognized that in Wolfe’s example, the zigzagging phenomenon occurred
because, after a finite number of iterations, the local minimization along the
computed step direction did not occur. Instead, the steplength was based entirely on
feasibility considerations. In his paper [30], descriptively entitled “Anti-zigzagging
by Bending,” McCormick sought to modify Rosen’s method so that longer steps would
be taken at each iteration and thus avoid the short steps associated with zigzagging.
He proposed, for problem (3.1), to take the steepest feasible descent direction initially.
However, when a boundary was encountered, instead of stopping, the step direction
vector was “bent” to follow the newly encountered constraint. The next iterate was
chosen to minimize the objective function along this bent vector. Of course, several
“bendings” might be required. This approach allowed longer steps to be taken and
prevented the problem of the steplength being dictated by the constraints and not the
minimization. McCormick demonstrated that this strategy prevented zigzagging, i.e.,
he proved a version of Curry’s theorem. McCormick and Tapia [28] noted that the
“bending” method was equivalent to Goldstein’s gradient projection method for the
special case where the feasible region is the nonnegative orthant. Then they extended
Curry’s theorem to the general gradient projection method [29].

*minimize $\frac{4}{3}(x^2 - xy + y^2)^{3/4} - z$, subject to $x, y, z \ge 0$.

3.3  Observations 

Initially,  both  Goldstein’s  and  Rosen’s  methods  take  a  step  in  the  steepest  feasible 
descent direction. (See the corollary to Proposition 2 in McCormick and Tapia [29].)
It  is  when  a  new  boundary  is  encountered  that  the  difference  occurs.  The  gradient 
projection  direction  adaptively  changes  as  it  meets  a  boundary,  while  the  projected 
gradient  method  stops  at  the  boundary.  It  is  this  seemingly  small  distinction  which 
allows  one  to  zigzag  while  the  other  cannot. 

These observations indicate the importance of considering not only the active
constraints, but also the inactive constraints when making a choice of direction at any
iteration. The Euclidean norm steepest feasible descent does not use information
about the inactive constraints in determining the direction; it only considers which
direction gives the greatest amount of local decrease. In choosing the norm in which
decrease is measured, we believe that it is correct to include information about
distance from the inactive constraints. It is this property of the Dikin-Karmarkar
norm which contributes to its good convergence properties.
Chapter 4

THE  DIKIN  &  KARMARKAR  ALGORITHMS 

4.1  Karmarkar’s  Algorithm 

In 1984, Karmarkar [23] proposed a polynomial-time method for the solution of linear
programming problems of the form

$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & Ax = 0 \\ & e^T x = 1 \\ & x \ge 0, \end{array} \tag{4.1}$$

where $c, x, e \in \mathbb{R}^n$, $e = (1, 1, \ldots, 1, 1)^T$, $A \in \mathbb{R}^{m \times n}$ is of full row rank, and the
optimal objective function value is zero.


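KARMARKAR’S ALGORITHM: Given an initial, strictly feasible
point, $x^0$, for problem (4.1) and a tolerance for the objective function,
$\epsilon > 0$, let $k = 0$.

WHILE $c^T x^k > \epsilon$ DO

• $D_k \leftarrow \mathrm{diag}(x^k)$

• Compute $x' \in \mathbb{R}^n$ as the solution to

$$\begin{array}{ll} \text{minimize} & c^T D_k x' \\ \text{subject to} & A D_k x' = 0 \\ & e^T x' = 1 \\ & \|e - x'\|_2 \le \delta \end{array}$$

• $x^{k+1} \leftarrow D_k x' / (e^T D_k x')$

• $k \leftarrow k + 1$

END DO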


Theoretically the algorithm was appealing because it was a polynomial-time
algorithm. Karmarkar’s algorithm was not the first algorithm for linear programming
to have a theoretical polynomial-time bound. In 1979, Khachiyan [24] proposed a
modification to the ellipsoid method which led to the first polynomial-time
algorithm for linear programming. Unfortunately, the practical performance of the
ellipsoid method was disappointing: it was not competitive with the simplex method.
However, Karmarkar’s method was practically appealing because, in some cases, its
performance did rival that of the simplex method. The approach of the algorithm
was much different from that of the simplex algorithm, the iterates moving through
the interior of the feasible region rather than along the boundaries.

While  Karmarkar’s  algorithm  is  not  a  straightforward  steepest  feasible  descent 
method  for  problem  (4.1),  the  subproblem  solved  at  each  iteration  has  the  form  of 
a  steepest  feasible  descent  subproblem  with  respect  to  a  weighted  Euclidean  norm. 
That  norm  is 

$$\|\cdot\|_D = \|D^{-1}\,\cdot\|_2, \qquad D = \mathrm{diag}(x^k), \tag{4.3}$$

where $x^k$ is the current, strictly feasible iterate. It has been shown by Morshedi and
Tapia  [31]  and  by  Tapia  and  Zhang  [38]  that  Karmarkar’s  algorithm  is  actually  a 
steepest  feasible  descent  method  applied  to  the  nonlinear  program  which  results  from 
a  simple  transformation  of  the  linear  program. 

4.2 The Affine-Variant

Subsequent to the announcement of Karmarkar’s algorithm, researchers considered
modifications to the algorithm. Motivated to simplify Karmarkar’s algorithm and
to develop an algorithm that gave monotone decrease in the objective function, the
affine-scaling variant was introduced. It provided a simpler scaling of the problem,
decrease in the objective function at each iteration, and no longer required that the
right hand side of the linear equality constraints be zero or that the objective function
be zero at the solution.

The subproblem that is solved at each iteration is

$$\begin{array}{llr} \text{minimize} & c^T D x' & (4.4) \\ \text{subject to} & A D x' = 0 & (4.5) \\ & e^T D x' = 1 & (4.6) \\ & \|e - x'\|_2 \le \delta. & (4.7) \end{array}$$
We can, via a change of variables, produce an equivalent subproblem where the
matrix $D$ does not appear in (4.4), (4.5), and (4.6), and (4.7) becomes

$$\|D^{-1}(x^c - x)\|_2 \le \delta.$$

In this manner, we observe that the affine variant can be viewed as a method of
steepest feasible descent with respect to the norm (4.3).
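
The change of variables can be written out in one line. As a sketch, assume that $x^c$ denotes the current iterate, that $D = \mathrm{diag}(x^c)$, and that the new variable is $x = D x'$ (these identifications are our reading of the notation, stated here as assumptions):

```latex
x = D x', \qquad x^c = D e
\;\Longrightarrow\;
e - x' = D^{-1} x^c - D^{-1} x = D^{-1}(x^c - x),
\qquad\text{so}\qquad
\|e - x'\|_2 \le \delta \iff \|D^{-1}(x^c - x)\|_2 \le \delta .
```

Under the same substitution, each product $D x'$ in the objective and in the linear constraints is replaced by $x$, which is why $D$ survives only in the norm constraint.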

4.3  Dikin’s  Algorithm 

In 1967, Dikin [13] considered an extension of the method of steepest descent to
linear and quadratic programming problems with inequality constraints, specifically
nonnegativity of the variables. He proposed an iterative method to solve linear
programming problems of the form

$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & Ax = b \\ & x \ge 0, \end{array} \tag{4.8}$$

where $c, x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, and $A \in \mathbb{R}^{m \times n}$ is of full row rank.

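DIKIN’S ALGORITHM: Given an initial, strictly feasible point, $x^0$,
for problem (4.8), let $k = 0$.

1. $D_k \leftarrow \mathrm{diag}(x^k)$

2. Compute $\mu^k \in \mathbb{R}^m$ as the solution to

$$\text{minimize} \quad \sum_{j=1}^{n} \Big[ x_j^k \Big( \sum_{i=1}^{m} a_{ij}\mu_i - c_j \Big) \Big]^2 \tag{4.9}$$

3. $\delta^k \leftarrow A^T \mu^k - c$

4. $\phi_k \leftarrow (x^{k\,T} \delta^k)^2$

5. WHILE $\phi_k \ne 0$ DO

• $\lambda_k \leftarrow 1/\sqrt{\phi_k}$

• $x^{k+1} \leftarrow x^k + \lambda_k D_k^2 \delta^k$

• $k \leftarrow k + 1$

• GO TO 1

END DO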
Note that problem (4.9) is exactly the least squares problem

$$\underset{\mu \in \mathbb{R}^m}{\text{minimize}} \quad \|\bar{A}^T \mu - \bar{c}\|_2, \tag{4.10}$$

for $\bar{A} = A D_k$ and $\bar{c} = D_k c$. When $A$ has full row rank, the unique solution to
problem (4.10) is given by

$$\mu^k = (A D_k^2 A^T)^{-1} A D_k^2 c. \tag{4.11}$$

Therefore, the step taken at each iteration is

$$-D^2 \left[ I - A^T (A D^2 A^T)^{-1} A D^2 \right] c, \tag{4.12}$$

which is exactly the steepest feasible descent direction given in (2.7).
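
The equivalence between the least squares route and the closed form can be checked numerically. The sketch below assumes NumPy; the function name `dikin_direction` and the small data set are hypothetical. It solves the least squares problem for $\mu$ with `numpy.linalg.lstsq`, forms $D^2(A^T\mu - c)$, and compares the result with (4.11) and (4.12). Only the direction is computed; the steplength rule of the algorithm is not modeled.

```python
import numpy as np

def dikin_direction(A, c, x):
    """Return (mu, step) with step = D^2 (A^T mu - c), D = diag(x)."""
    D = np.diag(x)
    A_bar = A @ D                      # scaled constraint matrix
    c_bar = D @ c                      # scaled cost vector
    # Least squares problem (4.10): minimize ||A_bar^T mu - c_bar||_2.
    mu, *_ = np.linalg.lstsq(A_bar.T, c_bar, rcond=None)
    delta = A.T @ mu - c
    return mu, D @ D @ delta

# Illustrative data (assumed): x is strictly feasible for A x = 1.
A = np.array([[1.0, 1.0, 1.0]])
c = np.array([1.0, 2.0, 3.0])
x = np.array([0.2, 0.3, 0.5])
mu, step = dikin_direction(A, c, x)

# Closed forms (4.11) and (4.12) for comparison.
D2 = np.diag(x ** 2)
mu_closed = np.linalg.solve(A @ D2 @ A.T, A @ D2 @ c)
step_closed = -D2 @ (c - A.T @ mu_closed)
```

The two computations agree up to rounding, and the resulting step lies in the null space of $A$ and decreases $c^T x$.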

Dikin [14] proved a version of Curry’s theorem for his algorithm, namely that any
limit point of the iterative sequence is a solution of the linear programming problem,
with the only requirement being primal nondegeneracy. So we find that the weighted
Euclidean norm chosen by Dikin and Karmarkar overcomes the zigzagging problem
associated with Euclidean norm steepest feasible descent. We will refer to the common
norm (4.3) as the Dikin-Karmarkar or DK-norm.

We know that our iterates $x^k$ may have some components that are converging
to zero. So any measurement of distance should be a relative one [38]. We believe
that this relative weighting of the steps, which allows us to treat components that
are converging to zero on an equal footing, contributes to making the Dikin-Karmarkar
norm an ideal choice. We refer to the choice of the Dikin-Karmarkar norm in the
context of steepest feasible descent for problems with nonnegativity constraints as
the Dikin-Karmarkar principle.
Chapter 5

THE  DIKIN-KARMARKAR  PRINCIPLE 

The choice of norm by Dikin and Karmarkar is specifically suited to problems with
nonnegativity constraints. For this reason, as we look at the role of the norm in
this context, we will restrict our attention to linear programming problems with
nonnegativity constraints on all the variables:

$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & x \ge 0. \end{array} \tag{5.1}$$

We are specifically interested in steepest descent directions for this problem, with
respect to the Euclidean norm and with respect to the Dikin-Karmarkar norm.

5.1  The  Choice  of  Norm 


Figure 5.1 The Dikin-Karmarkar and Euclidean Unit Balls in $\mathbb{R}^2$

In Figure 5.1, we illustrate the unit balls in both the Euclidean and the
Dikin-Karmarkar norms in $\mathbb{R}^2$. The geometry of the Dikin-Karmarkar unit ball changes
based on the distance of the current iterate from the boundaries, while the Euclidean
ball is fixed, regardless of the boundaries.
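
The geometric difference can be checked numerically. In $\mathbb{R}^2$, the unit ball of the norm (4.3) centered at $x > 0$ is an ellipse with semi-axes $x_1$ and $x_2$, so it touches the coordinate axes without crossing them. The following sketch samples the boundary of both unit balls; the point `x` and the helper names are illustrative assumptions.

```python
import math

def dk_unit_ball_boundary(x, n_points=360):
    """Boundary points x + d with ||D^{-1} d||_2 = 1, D = diag(x), in R^2."""
    angles = [2.0 * math.pi * k / n_points for k in range(n_points)]
    return [(x[0] + x[0] * math.cos(t), x[1] + x[1] * math.sin(t)) for t in angles]

def euclidean_unit_ball_boundary(x, n_points=360):
    """Boundary points x + d with ||d||_2 = 1 in R^2."""
    angles = [2.0 * math.pi * k / n_points for k in range(n_points)]
    return [(x[0] + math.cos(t), x[1] + math.sin(t)) for t in angles]

x = (2.0, 0.1)  # illustrative interior point close to the boundary x_2 = 0
dk_points = dk_unit_ball_boundary(x)
euclid_points = euclidean_unit_ball_boundary(x)

# Does every sampled boundary point satisfy the nonnegativity constraints?
dk_stays_feasible = all(p[0] >= 0.0 and p[1] >= 0.0 for p in dk_points)
euclid_stays_feasible = all(p[0] >= 0.0 and p[1] >= 0.0 for p in euclid_points)
```

For this point the Dikin-Karmarkar ball remains inside the nonnegative orthant, while the Euclidean unit ball extends well below the axis $x_2 = 0$.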
5.2 The Dikin-Karmarkar Principle

When using an iterative method to solve problems with inequality constraints, it
is important that the direction chosen at each iteration take into account all the
boundaries of the feasible region. When using steepest feasible descent, the choice of
the Euclidean norm ignores the boundaries in the choice of direction: the direction
is always the negative gradient. However, the Dikin-Karmarkar norm is such that the
distance of the current strictly feasible iterate from each of the boundaries is taken
into account in the norm itself. It is this choice of norm, in the context of solving
problems with inequality constraints, that we call the Dikin-Karmarkar Principle. By
taking all the boundaries into account, the norm allows the direction taken to reflect
not only the amount of local decrease, but also how far we can move in the direction
chosen before a boundary is encountered. We believe, and demonstrate in the theory
and numerical results that follow, that it is this consideration of the boundary that
makes steepest feasible descent with respect to the Dikin-Karmarkar norm a
more effective algorithm than steepest feasible descent with respect to the Euclidean
norm.

5.3  Behavior  of  Steepest  Feasible  Descent  Near  the  Boundary 


Figure 5.2 Sequence Converging to the Boundary

We have seen in Chapter 3 that steepest feasible descent methods may encounter
problems with convergence as a result of taking short steps. In particular, zigzagging
can occur as steps are taken toward the boundary. However, in linear programming
problems, we know that the solution lies on the boundary, so taking such steps is
necessary.

We  gained  intuition  about  the  geometry  of  steepest  feasible  descent  with  respect 
to  both  norms  by  looking  at  what  directions  would  be  generated  by  the  algorithms 
at  each  point  of  a  sequence  converging  to  the  boundary.  In  Figure  5.2,  we  see  a 
particular  sequence  of  points  converging  to  the  boundary. 

We choose a particular linear functional $c^T x$. Figure 5.3 illustrates the directions
generated when we use steepest descent with respect to the Euclidean norm at each
point of this particular sequence. Figure 5.4 illustrates the directions generated when
we use steepest descent with respect to the Dikin-Karmarkar norm at each point in
this same sequence.

Note that the directions generated using the Euclidean norm produce relatively
shorter and shorter steps to the closest boundary, while the directions generated
using the Dikin-Karmarkar norm produce relatively longer steps to the boundary.
These observations lead us to examine the phenomenon of short steps in steepest
descent.
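
To make the contrast concrete, the following sketch measures the longest feasible step along each direction for the problem of minimizing $c^T x$ subject to $x \ge 0$. The sequence $x^k = (1, 2^{-k})$ and the vector $c = (1, 1)^T$ are assumptions chosen for illustration; they are not the data behind Figures 5.3 and 5.4. The Euclidean direction is $-c$ and the Dikin-Karmarkar direction is $-D^2 c$ with $D = \mathrm{diag}(x^k)$.

```python
import math

def max_feasible_step(x, d):
    """Largest alpha >= 0 with x + alpha * d >= 0 (ratio test)."""
    ratios = [-xi / di for xi, di in zip(x, d) if di < 0]
    return min(ratios) if ratios else math.inf

def step_to_boundary(x, d):
    """The full step alpha * d that moves x onto the boundary along d."""
    alpha = max_feasible_step(x, d)
    return [alpha * di for di in d]

c = [1.0, 1.0]
euclid_lengths, dk_lengths = [], []
for k in range(1, 11):
    x = [1.0, 2.0 ** -k]                                # approaches x_2 = 0
    d_euclid = [-ci for ci in c]                        # -c
    d_dk = [-(xi ** 2) * ci for xi, ci in zip(x, c)]    # -D^2 c
    euclid_lengths.append(math.hypot(*step_to_boundary(x, d_euclid)))
    dk_lengths.append(math.hypot(*step_to_boundary(x, d_dk)))
```

On this sequence the Euclidean step lengths shrink in proportion to the distance from the nearest boundary, while the Dikin-Karmarkar step lengths stay at or above one.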
Figure 5.3 Steepest Descent Directions with respect to the Euclidean Norm

Figure 5.4 Steepest Descent Directions with respect to the Dikin-Karmarkar Norm
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Chapter  6 

THEORETICAL  RESULTS 

Before a formal statement of theorems, we first set the stage. We begin by restricting
our attention to linear programming problems with nonnegativity constraints:

\[
\begin{array}{ll}
\text{minimize} & c^T x \\
\text{subject to} & x \ge 0,
\end{array}
\tag{6.1}
\]

where $c > 0$. We are interested in well-posed problems. It is for this reason that we
restrict ourselves to problems in which the vector $c$ is strictly positive; otherwise
problem (6.1) would be unbounded or would fail to have a unique solution. We refer to
such linear functionals, where $c > 0$, as valid linear functionals. With the problem we
are addressing now clearly stated, we examine short steps in steepest descent methods
applied to this problem.

From an interior point $x > 0$, we consider the direction of the shortest step to
the boundary of $\{ x \mid x \ge 0 \}$. This short step is illustrated for $\mathbb{R}^2$ in Figure 6.1.


Figure 6.1 A Short Step in $\mathbb{R}^2$

Definition 6.1 (A Shortest Step Direction) Consider a point $x > 0$.
We say that $d$ is a shortest step direction from $x$ if

\[ d = -\alpha e_j, \]

where $\alpha > 0$, $e_j$ is the $j$th standard basis vector, and $j$ is the index of
the smallest component of $x$.

Figure 6.2 $\varepsilon$-Short Step Direction in $\mathbb{R}^3$

We would like to stay away from moving in a shortest step direction at any particular
iteration; in fact, we wish to stay away from a neighborhood of such undesirable
directions. These directions are illustrated for $\mathbb{R}^3$ in Figure 6.2 and are defined as
follows:

Definition 6.2 ($\varepsilon$-Short Step Direction) Given a point $x > 0$, let $d = -e_j$
be a shortest step direction from $x$. Choose $\beta > 0$ so that $y = x + \beta d$
is on the boundary $\mathcal{F}_j = \{ z : z_j = 0 \}$. For $\varepsilon > 0$, let

\[ \Omega_\varepsilon = \{ z \in \mathcal{F}_j \mid \|z - y\|_2 \le \varepsilon \}. \tag{6.2} \]

We say that any $s \in \mathbb{R}^n$ is an $\varepsilon$-short step direction if

\[ x + \alpha s \in \Omega_\varepsilon \tag{6.3} \]

for some $\alpha > 0$.
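Definition 6.2 can be checked numerically for a given point and direction. The following is a minimal sketch, not part of the original development; it assumes that the shortest step direction points toward the nearest face ($d = -e_j$), so that $y$ is $x$ with its smallest component set to zero, and the function name is hypothetical.

```python
import math

def is_eps_short_step_direction(x, s, eps):
    """Check Definition 6.2 for a strictly positive point x and a direction s."""
    # Index of the smallest component of x (first one if there are ties).
    j = min(range(len(x)), key=lambda i: x[i])
    if s[j] >= 0:
        # x + alpha*s never reaches the face z_j = 0 for alpha > 0.
        return False
    alpha = -x[j] / s[j]                     # step length that lands on z_j = 0
    landing = [xi + alpha * si for xi, si in zip(x, s)]
    landing[j] = 0.0                         # remove rounding error on the face
    y = list(x)
    y[j] = 0.0                               # assumed: y = x + beta*d with d = -e_j, beta = x_j
    return math.dist(landing, y) <= eps

x = [1.0, 2.0, 0.5]
straight_down = is_eps_short_step_direction(x, [0.0, 0.0, -1.0], 0.1)
slanted_small_eps = is_eps_short_step_direction(x, [1.0, 0.0, -1.0], 0.1)
slanted_large_eps = is_eps_short_step_direction(x, [1.0, 0.0, -1.0], 0.6)
```

The direction straight toward the nearest face qualifies for any positive tolerance, while a slanted direction qualifies only when it lands within the tolerance of $y$.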

6.1  Tools  Necessary  for  Proof  of  Theorems 

We wish to compare steepest feasible descent for problem (6.1) with respect to both
the Euclidean norm and the Dikin-Karmarkar norm. For linear programming problems
of this form, it is impossible to say that at a particular iteration one norm choice
will always give better performance than the other. However, if we look at all valid
linear functionals, what can we say about how these two choices of norm will affect
our performance overall?

At a given point $x > 0$, we consider the proportion of all valid linear functionals
that will give us an $\varepsilon$-short step direction when we use a method of steepest descent,
i.e., we want to find

\[ \frac{\text{measure of valid linear functionals giving an $\varepsilon$-short step direction}}{\text{measure of valid linear functionals}}. \]

So we must define a measure for the set of linear functionals.

6.1.1  Parametrization  of  Linear  Functionals 


Figure 6.3 Parametrization of Linear Functionals in $\mathbb{R}^2$

We begin by parametrizing linear functionals in terms of their unit normals:
we represent a particular linear functional $c^T x$ by $\bar{c}$, where

\[ \bar{c} = \frac{c}{\|c\|_2}. \tag{6.4} \]

This leads to a parameter space which is the unit sphere.

6.1.2 Measure of Linear Functionals

Thus we will define the measure of a particular set of linear functionals to be the
surface area of the portion of the unit sphere which represents that set of linear
functionals. For the set of linear functionals valid for problem (6.1), the measure is
the surface area of the unit sphere in the positive orthant.

The surface area can easily be computed using spherical coordinates and integrating
over the representative area of the unit sphere in $\mathbb{R}^n$. The angles that will
be integrated over will be taken from

\[ 0 \le \theta_i \le \pi/2 \]

rather than $0 \le \theta_i \le \pi$, since the valid linear functionals lie only in the positive
orthant. Let $S(n)$ denote the surface area of the unit sphere that represents the valid
linear functionals; then

\[ S(n) = \int_0^{\pi/2} \left\{ \int dV_{n-2} \right\} d\theta_n, \tag{6.5} \]

where $dV_{n-2}$ is the $(n-2)$-dimensional volume differential.

We will denote the surface area of the valid linear functionals for which the steepest
feasible descent direction at $x$ is an $\varepsilon$-short step direction by $S_\varepsilon(x, n)$.

Thus, the proportion of valid linear functionals in $\mathbb{R}^n$ for which the steepest
descent direction at $x > 0$ is an $\varepsilon$-short step direction is given by

\[ m_\varepsilon(x, n) = \frac{S_\varepsilon(x, n)}{S(n)}. \tag{6.6} \]


For  the  Euclidean  Norm 

The surface area of the linear functionals that will produce $\varepsilon$-short step directions
when the Euclidean norm is chosen can be seen in Figure 6.4. Consider a point $x > 0$.
Without loss of generality, let

\[ x_n = \min\{ x_i,\ i = 1, \ldots, n \}. \]

For ease of notation, we will let $r = x_n$. View the $x'$ axis as representing the
$(n-1)$-dimensional surface in $\mathbb{R}^n$ where $x_n = 0$.

Figure 6.4 Computing the Measure for the Euclidean Norm

The surface area of the linear functionals for which the steepest descent direction
with respect to the Euclidean norm is an $\varepsilon$-short step direction can be computed by
integrating $\theta_n$ from $0$ to $\theta$, i.e.,

\[ S_\varepsilon(x, n) = \int_0^{\theta} \left\{ \int dV_{n-2} \right\} d\theta_n \tag{6.7} \]

\[ \phantom{S_\varepsilon(x, n)} = \theta \left\{ \int dV_{n-2} \right\}. \tag{6.8} \]

The angle $\theta$ is determined by $\varepsilon$ and $r$:

\[ \theta = \arctan(\varepsilon / x_n). \tag{6.9} \]

Thus, for $x > 0$ for which $x_i = \min\{ x_j,\ j = 1, \ldots, n \}$, the proportion we are
interested in is given by

\[ \frac{S_\varepsilon(x, n)}{S(n)} = \frac{\arctan(\varepsilon / x_i)}{\pi/2}. \tag{6.10} \]
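To see how the proportion (6.10) behaves as a point approaches the boundary, it can be evaluated directly. The sketch below is illustrative only; the function name, the sample points, and the choice $\varepsilon = 0.1$ are assumptions made for the example.

```python
import math

def euclidean_short_step_proportion(x, eps):
    """Proportion (6.10): arctan(eps / min_i x_i) divided by pi/2."""
    return math.atan(eps / min(x)) / (math.pi / 2)

# Points (1, 1, 10^-k) whose third component shrinks toward the boundary.
proportions = [
    euclidean_short_step_proportion([1.0, 1.0, 10.0 ** -k], 0.1)
    for k in range(6)
]
```

The values in `proportions` increase toward one as the smallest component shrinks, which is the behavior formalized later in this chapter for the Euclidean norm.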

For  the  Dikin-Karmarkar  Norm 

The problem that we need to solve is as follows: given a strictly positive point
$x \in \mathbb{R}^n$ and $\varepsilon > 0$, find the set of all linear functionals for which steepest
descent with respect to the Dikin-Karmarkar norm will produce an $\varepsilon$-short step at $x$.

Figure 6.5 Computing the Measure for the Dikin-Karmarkar Norm

Without loss of generality, suppose that

\[ x_n = \min\{ x_j,\ j = 1, \ldots, n \}. \]

Let $r = x_n$. Consider a unit sphere centered at $x$. We make a change of coordinate
systems by translating the entire space by $x$ so that our sphere is now centered at the
origin. We will denote all points $x \in \mathbb{R}^n$ as

\[ x = (x', x_n), \]

where $x' = (x_1, x_2, \ldots, x_{n-1}) \in \mathbb{R}^{n-1}$. Our closest face $\mathcal{F}_n$ is now the
surface at which $x_n = -r$. The center of our $\Omega_\varepsilon$ region is $(0', -r)$. See Figure 6.5.

Every $\varepsilon$-short step direction from $x$ produced by steepest descent with respect to
the Dikin-Karmarkar norm can be written

\[ s = -D^2 c, \]

where $D = \mathrm{diag}(x)$. Since $x + s = -D^2 c \in \Omega_\varepsilon$, then
$\|{-D^2 c} - (0', -r)\|_2 \le \varepsilon$ and $(D^2 c)_n = r$. Thus

\[ \Omega_\varepsilon = \{ (x', x_n) \mid \|D'^2 x'\|_2 \le \varepsilon \text{ and } x_n = r \}, \tag{6.11} \]

where $D' = \mathrm{diag}(x_1, x_2, \ldots, x_{n-1})$. The linear functionals that give rise to
$\Omega_\varepsilon$ are described by

\[ F_x = \{ (x', x_n) \mid \|D'^2 x'\|_2 \le r^2 \varepsilon \text{ and } x_n = r \}. \tag{6.12} \]

Let $(x', r) \in F_x$. With the unit sphere expressed as

\[ S = \{ t\,(x', r) \mid \|t\,(x', r)\|_2 = 1 \}, \qquad t = \frac{1}{\sqrt{\|x'\|_2^2 + r^2}}, \tag{6.13} \]

the surface of the unit sphere that describes the set of all linear functionals that will
give rise to an $\varepsilon$-short step is

\[ X_\varepsilon = \{ t\,(x', r) \mid \|D'^2 x'\|_2 \le r^2 \varepsilon \}. \tag{6.14} \]

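We make the change of variables $y = x / \|x\|_2$:

\[ y' = \frac{x'}{\sqrt{\|x'\|_2^2 + r^2}}, \tag{6.15} \]

\[ y_n = \frac{r}{\sqrt{\|x'\|_2^2 + r^2}}. \tag{6.16} \]

From (6.15), (6.16), and since $\|y'\|_2^2 + y_n^2 = 1$, we have

\[ x' = \frac{r}{\sqrt{1 - \|y'\|_2^2}}\, y'. \tag{6.17} \]

Thus our surface of interest, (6.14), can be described by

\[ Y = \left\{ y' \;\middle|\; \frac{1}{\sqrt{1 - \|y'\|_2^2}} \|D'^2 y'\|_2 \le \varepsilon r \right\}, \tag{6.18} \]

\[ Y_n = \left\{ y_n = \sqrt{1 - \|y'\|_2^2} \right\}. \tag{6.19} \]

The surface increment we wish to integrate over is

\[ dS = dy' / y_n. \tag{6.20} \]

So our surface area is given by

\[ \int_Y \frac{dy'}{\sqrt{1 - \|y'\|_2^2}}. \tag{6.21} \]

Now, we consider the set $Y$ that we are integrating over. From (6.18) we have

\[ \sum_{i=1}^{n-1} \left( \frac{d_i^4 + r^2 \varepsilon^2}{r^2 \varepsilon^2} \right) y_i^2 \le 1, \tag{6.22} \]

where $d_i$ denotes the $i$th diagonal entry of $D'$. Let

\[ \lambda_i^2 = \frac{d_i^4 + r^2 \varepsilon^2}{r^2 \varepsilon^2}. \tag{6.23} \]

Note that $\lambda_i > 1$.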
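The step from (6.18) to (6.22) is not spelled out in the text. One possible reconstruction, assuming that $d_i$ denotes the $i$th diagonal entry of $D'$ (so that $d_i = x_i$), is to square the condition in (6.18) and collect the $y_i^2$ terms:

```latex
\begin{align*}
\frac{1}{1 - \|y'\|_2^2} \sum_{i=1}^{n-1} d_i^4\, y_i^2 &\le \varepsilon^2 r^2 \\
\sum_{i=1}^{n-1} d_i^4\, y_i^2 &\le r^2 \varepsilon^2 \Bigl( 1 - \sum_{i=1}^{n-1} y_i^2 \Bigr) \\
\sum_{i=1}^{n-1} \bigl( d_i^4 + r^2 \varepsilon^2 \bigr)\, y_i^2 &\le r^2 \varepsilon^2 \\
\sum_{i=1}^{n-1} \frac{d_i^4 + r^2 \varepsilon^2}{r^2 \varepsilon^2}\, y_i^2 &\le 1 .
\end{align*}
```

Each coefficient in the last line exceeds one because $d_i^4 > 0$ at a strictly positive point, which is why the scaling factors in (6.23) are greater than one.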

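We make a final change of variables, $z = \Lambda y'$, where $\Lambda = \mathrm{diag}(\lambda_i)$. From (6.21),
our surface integral is now

\[ \frac{1}{\lambda_1 \lambda_2 \cdots \lambda_{n-1}} \int_Z \frac{dz}{\sqrt{1 - (z_1/\lambda_1)^2 - (z_2/\lambda_2)^2 - \cdots - (z_{n-1}/\lambda_{n-1})^2}}, \tag{6.24} \]

where $Z = \{ z : \|z\|_2 \le 1 \}$. Note that the integrand is bounded on $Z$:

\[ 1 \le \frac{1}{\sqrt{1 - \|\Lambda^{-1} z\|_2^2}} \le \frac{1}{\sqrt{1 - \|z\|_2^2}}. \tag{6.25} \]

So (6.24) can be bounded.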


6.1.3  Converging  Sequence  Described 

With this concept of how to measure the effect of a particular norm choice, we again
consider a sequence of points $\{ x^k \}$ which converges to the boundary; in particular,
we look at $\{ x^k \}$ for which

\[ x^k \to x^*, \]

where

\[ x^*_i = 0, \quad i = j, \qquad x^*_i > 0, \quad i \ne j, \tag{6.26} \]

for some $1 \le j \le n$.

6.2 Euclidean Steepest Feasible Descent Gives Short Steps

We consider the performance of steepest descent with respect to the Euclidean norm
and the Dikin-Karmarkar norm for this sequence converging to the boundary.

Finally, we give a formal statement of our theoretical results for steepest feasible
descent, with respect to the Euclidean norm, for problem (6.1).

Theorem 6.1 Given a sequence $\{ x^k \}$ which converges to a point $x^*$
satisfying (6.26) and an $\varepsilon > 0$, for steepest descent with respect to the
Euclidean norm,

\[ \lim_{k \to \infty} m_\varepsilon(x^k, n) = 1. \tag{6.27} \]

Therefore, for the Euclidean norm, the proportion of linear functionals for
which the steepest descent direction is an $\varepsilon$-short step direction is one in
the limit.


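Proof Without loss of generality, assume that $x^k \to x^*$ for which $x^*_1 = 0$. From
(6.9) we see that

\[ \theta = \arctan\left( \frac{\varepsilon}{x^k_1} \right). \tag{6.28} \]

Letting $k \to \infty$,

\[ \theta = \arctan\left( \frac{\varepsilon}{x^k_1} \right) \to \pi/2. \tag{6.29} \]

So from (6.6) and (6.7), we see that

\[ \lim_{k \to \infty} m_\varepsilon(x^k, n) = 1. \tag{6.30} \]

□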


6.3  Dikin-Karmarkar  Steepest  Feasible  Descent  Avoids  Short 
Steps 

We consider the same sequence $\{ x^k \}$ converging to a point on the boundary and
look at $m_\varepsilon$ for the Dikin-Karmarkar norm.

Theorem 6.2 Given a sequence $\{ x^k \}$ which converges to a point $x^*$
satisfying (6.26) and an $\varepsilon > 0$, for steepest descent with respect to the
Dikin-Karmarkar norm,

\[ \lim_{k \to \infty} m_\varepsilon(x^k, n) = 0. \tag{6.31} \]

Therefore, for the Dikin-Karmarkar norm, the proportion of linear functionals
for which the steepest descent direction is an $\varepsilon$-short step direction
is zero in the limit.

Proof Consider (6.24). Note that the integral is bounded so that

\[ 0 < m \le \int_Z \frac{dz}{\sqrt{1 - (z_1/\lambda_1)^2 - (z_2/\lambda_2)^2 - \cdots - (z_{n-1}/\lambda_{n-1})^2}} \le M < \infty. \tag{6.32} \]

However, we consider the quantity multiplying the integral:

\[ \frac{1}{\lambda_1 \lambda_2 \cdots \lambda_{n-1}} = \prod_{i=1}^{n-1} \left( \frac{r^2 \varepsilon^2}{d_i^4 + r^2 \varepsilon^2} \right)^{1/2}. \tag{6.33} \]

So for $x^k$ such that $x^k_n \to 0$, letting $r \to 0$, we see that (6.33) converges to zero. □
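The decay of the quantity (6.33) can be observed numerically. The following is a rough sketch under stated assumptions: $d_i$ is taken to be the $i$th component of $x$, the product runs over every index except that of the smallest component, and the function name, sample points, and $\varepsilon = 0.1$ are chosen only for illustration.

```python
import math

def dk_prefactor(x, eps):
    """Quantity (6.33): product of sqrt(r^2 eps^2 / (x_i^4 + r^2 eps^2)) over i != argmin."""
    j = min(range(len(x)), key=lambda i: x[i])   # index of the smallest component
    r = x[j]
    product = 1.0
    for i, xi in enumerate(x):
        if i != j:
            product *= math.sqrt((r * eps) ** 2 / (xi ** 4 + (r * eps) ** 2))
    return product

# Points (1, 1, 10^-k) whose third component shrinks toward the boundary.
values = [dk_prefactor([1.0, 1.0, 10.0 ** -k], 0.1) for k in range(6)]
```

For these sample points the entries of `values` shrink steadily toward zero as the smallest component approaches the boundary, in contrast with the Euclidean proportion, which tends to one.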


Thus we find that as our iterates approach the boundary, we are assured that our
iterates will be bounded away from a region of short steps.

Chapter 7

NUMERICAL RESULTS

Our theory gives an explanation of the behavior of steepest feasible descent with
respect to the Euclidean and the Dikin-Karmarkar norms for the simplified case
with only nonnegativity constraints. We wanted to discover whether this behavior
extended to linear programming problems with linear constraints added. We found
in our numerical experimentation that, indeed, the behavior described in our theory
occurred in this more general case.

In our numerical testing, steepest feasible descent with respect to the Dikin-Karmarkar
norm was compared to steepest feasible descent with respect to the Euclidean
norm. With the goal of discovering how this choice of norm in a steepest feasible
descent method affected the length of the step to the boundary, we made the following
comparisons. For each linear programming problem tested, we applied the steepest
feasible descent method as described in Section 2.2, for both the Dikin-Karmarkar
norm and the Euclidean norm. Each step was taken as a fixed fraction of the distance
to the boundary. At each iteration, a comparison was made of the length of the
steepest feasible descent step to the boundary for the solution method being applied
and the length of the steepest feasible descent step to the boundary for the other norm;
a comparison was also made between the amounts of decrease in the objective function
given by each steepest feasible descent step to the boundary.

The tables contain the following notation and information. The step taken to
the boundary in the steepest feasible descent direction with respect to the Euclidean
norm is denoted by $s_E$. Likewise, $s_{DK}$ denotes the step taken to the boundary in
the steepest feasible descent direction with respect to the Dikin-Karmarkar norm.
The new iterate was obtained by taking a step that is a fixed fraction $\alpha$ ($0 < \alpha < 1$)
of the distance to the boundary. In each table, the first column gives the iteration
count ITN. The second column compares the amount of decrease in the objective
function given by taking the Dikin-Karmarkar step to the amount of decrease possible
by taking the Euclidean step, $c^T s_{DK} / c^T s_E$. A value greater than one indicates
that the Dikin-Karmarkar step provided the greater decrease. The third column is the
ratio of the length of the Dikin-Karmarkar step, $\|s_{DK}\|$, to the length of the
Euclidean step, $\|s_E\|$. Again, a ratio greater than one indicates that the
Dikin-Karmarkar step is longer.
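The two ratios reported in the tables can be illustrated on the simplified problem with only nonnegativity constraints. This is a sketch under stated assumptions, not the code used for the experiments: it takes the Euclidean direction to be $-c$ and the Dikin-Karmarkar direction to be $-D^2 c$ with $D = \mathrm{diag}(x)$, ignores equality constraints (which would require a projection), and uses hypothetical function names and a made-up test point.

```python
import math

def step_to_boundary(x, direction):
    """Longest step alpha*direction from x > 0 that keeps x + alpha*direction >= 0."""
    alpha = min(xi / -di for xi, di in zip(x, direction) if di < 0)
    return [alpha * di for di in direction]

def compare_steps(x, c):
    """Return (decrease ratio, length ratio) of the Dikin-Karmarkar step to the Euclidean step."""
    s_e = step_to_boundary(x, [-ci for ci in c])                           # Euclidean: -c
    s_dk = step_to_boundary(x, [-(xi ** 2) * ci for xi, ci in zip(x, c)])  # Dikin-Karmarkar: -D^2 c

    def decrease(s):
        return sum(ci * si for ci, si in zip(c, s))                        # c^T s

    def length(s):
        return math.sqrt(sum(si * si for si in s))                         # ||s||_2

    return decrease(s_dk) / decrease(s_e), length(s_dk) / length(s_e)

# A point close to the face x_3 = 0, with all cost components equal.
decrease_ratio, length_ratio = compare_steps([1.0, 2.0, 0.01], [1.0, 1.0, 1.0])
```

At this point the Euclidean step is cut off almost immediately by the nearby face, so both ratios are well above one.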

Two types of problems were used in our testing. First, small, randomly generated
problems were tested. Second, a subset of the Netlib linear programming test
problems was tested.


7.1  Small  Dense  Problems 

The random problems generated had from 3 to 10 variables. The linear constraint
matrices were full rank and dense. The random problems were run with
the steplength parameter $\alpha$ varying from 0.8 to 0.99. There was not a significant
difference in the results for the different parameter values. As could be expected,
with a smaller steplength the number of iterations was slightly greater than
with a longer steplength. The stopping criterion utilized was that the relative error
satisfy $\|x^k - x^*\|_2 / \|x^*\|_2 < 10^{-6}$. Representative results for five problems are
given in Tables 7.1 through 7.5, with a summary in Table 7.6.

In Table 7.1, we see that for RAND01, at every iteration, the Dikin-Karmarkar step
is longer and gives greater decrease. Likewise for RAND02 and RAND03 (Tables 7.2
and 7.3). Note that in RAND03, at iteration 7, the Dikin-Karmarkar steplength is over
2,000 times greater than the Euclidean steplength and the function decrease at that
iteration is more than 150 times greater. In both RAND04 and RAND05, within the first
iterations, the Euclidean norm step is longer and gives greater decrease, but as the
solution (and thus the boundary) is approached, in both problems, the Dikin-Karmarkar
step becomes longer and gives greater decrease.

ITN    c^T s_DK / c^T s_E    ||s_DK|| / ||s_E||
  1          1.4677                 2.5484
  2          5.7968                10.6109
  3          2.7673                 5.8767
  4          1.5890                 2.4118
  5          2.7354                 4.9937
  6          1.6597                 2.3401
  7          3.3348                 6.8309
  8          1.9356                 2.8371
  9          1.7056                 3.4690
 10          1.7358                 2.4666
 11          3.4036                 7.0678
 12          1.8937                 2.7170
 13          1.8691                 3.8283

Table 7.1 Comparison of DK step and Euclidean step for RAND01

ITN    c^T s_DK / c^T s_E    ||s_DK|| / ||s_E||
  1          1.0217                 1.0287
  2          6.2182                32.2994
  3          1.9711                 5.7909
  4          1.7690                 4.5970
  5          2.2801                 9.6942
  6          1.1108                 1.2678

Table 7.2 Comparison of DK step and Euclidean step for RAND02

Table  7.6  gives  a  summary  of  these  five  problems.  The  first  two  columns  give 
problem  dimensions.  The  next  two  columns  give  the  average  function  decrease  ratios 
and  steplength  ratios  for  each  problem.  Note  that  on  all  problems,  on  the  average, 
the  Dikin-Karmarkar  step  was  longer  and  gave  greater  decrease. 

7.2  Netlib  Test  Problems 

A subset of the smaller Netlib linear programming test set was tested. The problems
are large and sparse. The results for AFIRO are shown in Table 7.7 and are
representative of those obtained for this test set. (For this particular example, the step
taken was 0.9 of the distance to the boundary.) We see that the relative decrease
in the objective function is superior for the Dikin-Karmarkar norm and the lengths
of the steps that can be taken are significantly longer than those for the Euclidean
norm. On the average, the amount of objective function decrease possible from the
Dikin-Karmarkar step was more than 22 times that possible from the Euclidean step,
and the Dikin-Karmarkar steplength to the boundary was on the average more than
300 times that of the Euclidean steplength to the boundary.


ITN    c^T s_DK / c^T s_E    ||s_DK|| / ||s_E||
  1          3.3816                 5.1067
  2          1.2347                 2.0131
  3          5.6713                35.6394
  4          3.0492                32.2240
  5          2.1973                 5.0041
  6          7.2994                66.6234
  7        156.2428              2132.7605
  8         16.1168               217.8787
  9          3.4182                25.2341
 10          2.9379                 7.6070
 11          3.3591                11.9794
 12          4.8396                15.6913
 13          3.5877                15.4935
 14          3.6922                 9.7217
 15          3.7458                13.8619
 16          4.5560                15.0198
 17          3.5331                14.6335

Table 7.3 Comparison of DK step and Euclidean step for RAND03


ITN    c^T s_DK / c^T s_E    ||s_DK|| / ||s_E||
  1          0.9027                 0.9073
  2          2.7223                 3.4063
  3          0.9370                 0.9389
  4          1.6543                 2.0211
  5          1.2457                 1.3742
  6          1.6943                 2.0838
  7          1.2348                 1.3599
  8          1.7011                 2.0941
  9          1.2326                 1.3570
 10          1.7024                 2.0962
 11          1.2322                 1.3564

Table 7.4 Comparison of DK step and Euclidean step for RAND04

ITN    c^T s_DK / c^T s_E    ||s_DK|| / ||s_E||
  1          0.9134                 0.9523
  2          2.6809                 6.9676
  3         14.4369                52.8578
  4          3.6554                60.7037
  5          2.6453                96.2155
  6          2.1560                13.9415
  7          4.3167               114.4900
  8          1.7907                11.9714
  9          5.0727               133.9721
 10          1.9639                12.1541
 11          3.0697                67.0537
 12          1.9113                22.2856
 13          5.2271               140.1575
 14          2.1218                12.7265

Table 7.5 Comparison of DK step and Euclidean step for RAND05


PROBLEM    NUMBER of    NUMBER of      AVERAGE           AVERAGE
           VARIABLES    CONSTRAINTS    FUNCTION RATIO    STEP RATIO
RAND01         9            3              2.6528            4.8332
RAND02         5            3              1.8583            5.8985
RAND03         9            1             13.2515          153.1547
RAND04         6            4              1.4781            1.4131
RAND05        10            6              3.7116           52.4497

Table 7.6 RANDOM PROBLEM SUMMARIES ($\alpha = 0.9$)

ITN    c^T s_DK / c^T s_E    ||s_DK|| / ||s_E||
  1         11.5739               175.4462
  2         19.8760               282.9671
  3         24.3338               356.0513
  4         20.5237               291.9174
  5         37.8911               567.2736
  6         22.6763               318.1621
  7         19.0612               275.0813
  8         22.7428               321.0754
  9         24.5948               356.2554
AVG         22.5860               327.1366

Table 7.7 Comparison of DK step and Euclidean step for AFIRO; (n = 51; m = 27)
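The AVG row of Table 7.7 can be reproduced from the nine per-iteration entries. The short check below is only an arithmetic illustration; the variable names are chosen for this example.

```python
# Per-iteration ratios for AFIRO, copied from Table 7.7.
function_ratios = [11.5739, 19.8760, 24.3338, 20.5237, 37.8911,
                   22.6763, 19.0612, 22.7428, 24.5948]
step_ratios = [175.4462, 282.9671, 356.0513, 291.9174, 567.2736,
               318.1621, 275.0813, 321.0754, 356.2554]

# Arithmetic means over the nine iterations.
mean_function_ratio = sum(function_ratios) / len(function_ratios)
mean_step_ratio = sum(step_ratios) / len(step_ratios)
```

Both means agree with the tabulated averages to four decimal places, consistent with the statement that the Dikin-Karmarkar step gave more than 22 times the decrease and more than 300 times the steplength on average.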




Chapter  8 

CONCLUDING  REMARKS 

We  have  developed  mathematical  theory  that  describes  both  the  asymptotic  short 
step  behavior  of  steepest  feasible  descent  with  respect  to  the  Euclidean  norm  and 
the  avoidance  of  short  steps  in  steepest  feasible  descent  with  respect  to  the  Dikin- 
Karmarkar  norm  as  the  boundary  is  approached.  This  theoretical  behavior  is  borne 
out  in  practice  on  problems  with  linear  equality  constraints  added. 

We conjectured that a norm which incorporates information about all the boundaries
and which also gives the longest step possible might give even better numerical
results for steepest feasible descent than the Dikin-Karmarkar norm.

As we developed the theory, we restricted our attention to problems with only
nonnegativity constraints:

\[
\begin{array}{ll}
\text{minimize} & c^T x \\
\text{subject to} & x \ge 0,
\end{array}
\tag{8.1}
\]

where $c > 0$. Observe that $x^* = 0$ solves this simple problem. Hence, the step $s$
that would solve the problem in one iteration from a strictly feasible point $x$ would
be $s = -x$, and this is also the longest step that can be taken among all steps that
maintain feasibility and give descent. Furthermore, this is the steepest descent step
for the weighted $\ell_\infty$ norm:

\[ \| \cdot \| \equiv \| D_x^{-1} \cdot \|_\infty, \tag{8.2} \]

where $D_x = \mathrm{diag}(x)$.

We might expect long steps and good convergence behavior if we were to use a
steepest feasible descent method with respect to this weighted norm to solve the more
general problem (8.3):

\[
\begin{array}{ll}
\text{minimize} & c^T x \\
\text{subject to} & Ax = b \\
 & x \ge 0,
\end{array}
\tag{8.3}
\]

where $c, x \in \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$, and $b \in \mathbb{R}^m$.

However, if we use this norm in a steepest feasible descent method for problem (8.3),
it is not clear how the steepest descent subproblem should be solved. (The
obvious approach would require the solution of a linear programming problem, and
this would not lead to an efficient algorithm.) We therefore restricted our attention
to weighted Euclidean norms so that the steepest feasible descent direction can be
computed by evaluating a linear projection, i.e., solving a system of linear equations.
Interestingly, we discovered that it was possible to use a weighted Euclidean norm
that would give us the same ideal direction as our weighted infinity norm (8.2).

We consider the following weighted Euclidean norm:

\[ \| \cdot \| \equiv \| W^{-1} \cdot \|_2, \tag{8.4} \]

where

\[ W^2 = D_x C^{-1}, \tag{8.5} \]

for $C = \mathrm{diag}(c)$ and $D_x = \mathrm{diag}(x)$. The vector $-x$ is a steepest feasible descent
direction with respect to this norm. In other words, for problem (8.1) the steepest
feasible descent direction will reach the solution in one step, as does that of the
weighted infinity norm (8.2). Utilizing this norm, the computational effort to solve the
steepest feasible descent subproblem involves a matrix factorization, versus the solution
of a complete linear programming problem as in the case of a weighted infinity norm.
We will refer to the norm defined by (8.4) and (8.5) as the long-step norm. It is clear
that steepest feasible descent with respect to the long-step norm satisfies Theorem 6.2.
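The claim that $-x$ is a steepest descent direction for the long-step norm can be checked numerically on the problem with only nonnegativity constraints. The sketch below relies on the standard fact that, for a weighted Euclidean norm $\|W^{-1} \cdot\|_2$, the steepest descent direction for $c^T x$ is a positive multiple of $-W^2 c$; the function name and the sample data are assumptions made for the example.

```python
def long_step_direction(x, c):
    """Direction -W^2 c for the norm ||W^{-1} .||_2 with W^2 = D_x C^{-1}."""
    w_squared = [xi / ci for xi, ci in zip(x, c)]       # diagonal of D_x C^{-1}
    return [-wi * ci for wi, ci in zip(w_squared, c)]   # -W^2 c

x = [0.5, 2.0, 3.0]      # a strictly feasible point (hypothetical data)
c = [4.0, 1.0, 0.25]     # a strictly positive cost vector (hypothetical data)
direction = long_step_direction(x, c)
landing_point = [xi + di for xi, di in zip(x, direction)]
```

For any strictly positive `x` and `c` the direction equals $-x$, so a full step lands on the origin, the solution of the nonnegativity-only problem.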

Using the same set of test problems discussed in Chapter 7, we ran steepest feasible
descent with respect to the long-step norm. Comparisons were made between the
behavior of steepest feasible descent with respect to the long-step norm and steepest
feasible descent with respect to the Euclidean norm, and between the behavior for the
long-step norm and the Dikin-Karmarkar norm. As expected, when comparison was
made with Euclidean norm steepest feasible descent, the long-step norm gave steps
that were significantly longer, and also greater decrease in the objective function than
was possible by using the Euclidean norm. However, in half of the problems
tested, the long-step norm steepest feasible descent was unable to converge to the
solution.

When compared to steepest feasible descent with respect to the Dikin-Karmarkar
norm, in the cases where the long-step norm steepest feasible descent was able to find
the solution, the long-step norm took longer steps and had greater function decrease.
However, it took on the average 46% more iterations.

The major obstacle that the implementation of steepest feasible descent with
respect to the long-step norm encountered was that the weighting matrix (8.5) tended
to become numerically singular before a solution could be found.

Our experience with this long-step norm leads us to believe that the good convergence
behavior exhibited when using the Dikin-Karmarkar norm is not solely due to
the fact that it takes longer steps than the Euclidean norm. Neither can the behavior
be attributed only to the fact that boundary information is incorporated into the
norm. We believe that an important factor in the success of the Dikin-Karmarkar
norm is the fact that all components are scaled uniformly, including those components
that are zero at the solution [3].